A Multimodal Approach to Automatic Geo-Tagging of Video

Authors

  • Jaeyoung Choi
  • Kannan Ramchandran
Abstract

Geo-tags provide essential support for organizing and retrieving the rapidly growing volume of user-captured video content shared online. Videos present a unique opportunity for automatic geo-tagging because they combine multiple information sources: textual metadata, visual cues, and audio cues. This report highlights various approaches (data-driven, semantic technology-based, and graphical model-based) to predicting the geo-location of online videos. The algorithms use textual, visual, and audio information sources, individually or in combination. All experiments were performed on a geo-coordinate prediction benchmarking corpus containing 10,438 videos. The performance of these algorithms is analyzed, revealing that textual metadata is more useful than visual or audio content, but that combining multiple cues yields better overall performance. The report concludes with a discussion of the impact that improved geo-coordinate prediction will have and the challenges that remain open for future research.

Similar resources

Suggestive Geo-Tagging Assistance for Geo-Collaboration Tools

An argumentation map is an online discussion forum for spatially related topics that combines the forum with an interactive map. The utility of an argumentation mapping tool highly depends on the accuracy and quantity of the geo-tags that link the discussion contributions to geographic locations. These geo-tags can be created manually by the users of the argumentation map or automatically by a ...

Achieving Multimodal Cohesion during Intercultural Conversations

How do English as a lingua franca (ELF) speakers achieve multimodal cohesion on the basis of their specific interests and cultural backgrounds? From a dialogic and collaborative view of communication, this study focuses on how verbal and nonverbal modes cohere together during intercultural conversations. The data include approximately 160-minute transcribed video recordings of ELF interactions ...

Multimodal Automatic Tagging of Music Titles using Aggregation of Estimators

This paper presents the participation in the MusiClef 2012 Multimodal Music Tagging task. It describes an approach that aggregates estimators to combine different sources of information.

How Spatial Segmentation improves the Multimodal Geo-Tagging

In this paper we present a hierarchical, multi-modal approach in combination with different granularity levels for the Placing Task at the MediaEval benchmark 2012. Our approach makes use of external resources like gazetteers to extract toponyms in the metadata and of visual and textual features to identify similar content. First, the boundary detection recognizes the country and its dimensio...

Towards an intelligent framework for multimodal affective data analysis

An increasingly large amount of multimodal content is posted on social media websites such as YouTube and Facebook every day. In order to cope with the growth of so much multimodal data, there is an urgent need to develop an intelligent multi-modal analysis framework that can effectively extract information from multiple modalities. In this paper, we propose a novel multimodal information e...

Publication date: 2012